Fast Approximate Natural Gradient Descent in a Kronecker Factored Eigenbasis
For models with many parameters, the covariance matrix they are based on becomes gigantic, making them inapplicable in their original form. This has motivated research into both simple diagonal approximations and more sophisticated factored approximations such as KFAC (Heskes, 2000; Martens & Grosse, 2015; Grosse & Martens, 2016). In the present work we draw inspiration from both to propose a novel approximation that is provably better than KFAC and amenable to cheap partial updates. It consists of tracking a diagonal variance, not in parameter coordinates, but in a Kronecker-factored eigenbasis, in which the diagonal approximation is likely to be more effective. Experiments show improvements over KFAC in optimization speed for several deep network architectures.
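The core idea can be illustrated in a few lines. The sketch below, written against NumPy with made-up dimensions and synthetic data, shows the method for a single fully-connected layer: eigendecompose the two KFAC Kronecker factors, project per-sample weight gradients into the resulting Kronecker-factored eigenbasis, and track a diagonal second moment there (rather than using the Kronecker product of eigenvalues, as KFAC implicitly does). All variable names and the damping constant are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions for one fully-connected layer: W has shape (d_out, d_in).
d_in, d_out, n_samples = 4, 3, 256

# KFAC-style Kronecker factors (assumed already estimated from data):
# A ~ E[a a^T] from layer inputs a, B ~ E[g g^T] from backpropagated gradients g.
a = rng.standard_normal((n_samples, d_in))
g = rng.standard_normal((n_samples, d_out))
A = a.T @ a / n_samples
B = g.T @ g / n_samples

# Eigendecompose each factor; (U_A kron U_B) spans the Kronecker-factored eigenbasis.
s_A, U_A = np.linalg.eigh(A)
s_B, U_B = np.linalg.eigh(B)

# Per-sample weight gradients G_i = g_i a_i^T, projected into the eigenbasis.
G = np.einsum('no,ni->noi', g, a)                # shape (n, d_out, d_in)
G_tilde = np.einsum('po,noi,iq->npq', U_B.T, G, U_A)

# Track a diagonal second moment directly in the eigenbasis,
# instead of the Kronecker product of eigenvalues s_A kron s_B.
s_star = (G_tilde ** 2).mean(axis=0)             # shape (d_out, d_in), nonnegative

# Precondition a gradient: project, divide by the tracked variances, project back.
eps = 1e-3                                       # damping term (assumed hyperparameter)
G_mean = G.mean(axis=0)
G_precond = U_B @ ((U_B.T @ G_mean @ U_A) / (s_star + eps)) @ U_A.T
```

In practice the eigendecompositions are amortized over many steps, while the cheap diagonal `s_star` can be refreshed at every step, which is what makes the partial updates mentioned above inexpensive.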
Reviews: Fast Approximate Natural Gradient Descent in a Kronecker Factored Eigenbasis
Summary: The paper describes a generic second-order stochastic optimisation scheme that exploits curvature information to improve the trade-off between convergence speed and computational effort. It proposes an extension of the approximate natural gradient method KFAC, in which the Fisher information matrix is restricted to a Kronecker structure. The authors propose to relax the Kronecker constraint and to use a general diagonal scaling matrix rather than a diagonal Kronecker scaling matrix. This diagonal scaling matrix is estimated from gradients, along with the Kronecker eigenbasis.
Quality: The idea in the paper is convincing and makes sense.
Thomas George, César Laurent, Xavier Bouthillier, Nicolas Ballas, Pascal Vincent